Imagine a situation where people carry a card holding their genetic data and insert it into ATM-like machines for diagnosis and treatment. Experts in the field do not consider this a wild dream at all. They are optimistic that a tie-up between bioinformatics and the pharmaceutical industry will realize the dream of genetic data cards.
Why should there be a bond between bioinformatics and the pharmaceutical industry to realize the dream of genetic data cards? The answer is simple. Drug discovery research depends heavily on bioinformatics to manage databases of small molecules that are potential lead compounds, to search databases of protein structures for structure-based drug design, and to model the docking of compounds to their target proteins.
Recent studies point out that the bioinformatics and pharmaceutical industries are seeking faster and more specific identification of 'targets' for chemical agents, along with better and more effective designs for new drugs that can be sold at an attractive profit.
Pharmaceutical companies expend enormous effort to identify even one new compound that will be the next hot thing on pharmacy shelves. They screen thousands or even millions of chemicals to find one molecule that is effective and safe for treatment. Computational methods, however, have narrowed the range of options that companies must search to develop a new compound with few side effects.
Given the pace at which innovation spreads, medicines will be discovered in the old trial-and-error fashion for only a short while longer. Scientists feel that they will soon be able to design drugs almost entirely on the computer, piecing together models of proteins with the chemical structures of potential drugs.
Structure-based drug design and lead compounds
Structure-based drug design methods involve modeling the three-dimensional structure of a protein. The structure is modeled in a way that allows the interaction between the potential drug target and various lead compounds to be examined, which speeds up drug discovery.
A lead compound is a small molecule that serves as the starting point of an optimization involving many small molecules that are closely related to the lead compound structure. Potential compounds are modeled computationally to estimate their "fit" to the target by computing a scoring function or an energy function.
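To make the idea of an energy-based "fit" concrete, here is a minimal sketch that scores a ligand pose by summing a 12-6 Lennard-Jones term over ligand-protein atom pairs. It is a toy illustration, not the scoring function of any particular docking program: the function names, the single `epsilon` and `sigma` values applied to every atom pair, the cutoff, and the coordinates are all assumptions made for the example.

```python
import math


def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """12-6 Lennard-Jones energy for two atoms a distance r apart.

    epsilon (well depth) and sigma (zero-energy distance, in angstroms)
    are illustrative values, not parameters from a real force field.
    """
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


def score_pose(ligand_atoms, protein_atoms, cutoff=8.0):
    """Sum pairwise energies over ligand-protein atom pairs within the cutoff.

    Lower (more negative) totals indicate a better steric fit in this toy model.
    """
    total = 0.0
    for ligand_xyz in ligand_atoms:
        for protein_xyz in protein_atoms:
            r = math.dist(ligand_xyz, protein_xyz)
            if 0.0 < r <= cutoff:
                total += lennard_jones(r)
    return total


# Hypothetical coordinates in angstroms: two protein atoms, two candidate poses.
protein = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
pose_close = [(2.0, 3.5, 0.0)]   # sits at a comfortable contact distance
pose_clash = [(0.5, 0.0, 0.0)]   # overlaps the first protein atom

print(score_pose(pose_close, protein))
print(score_pose(pose_clash, protein))
```

In this sketch the well-placed pose gets a negative (favorable) score and the clashing pose a large positive one. Practical scoring functions add further terms, such as the hydrogen-bonding and hydrophobic contributions discussed in this section.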
Most algorithms consider both structural and functional interactions, such as steric fit, hydrogen bonding and hydrophobic interactions. The initial design phase is usually followed by the synthesis of the lead compound, target protein binding assays and co-crystallization of the compound and target for X-ray structural studies.
Empirical information about how the lead compound actually binds to the target drives the refinement of the lead compound and its binding to the target. The refined lead compound is then synthesized and tested against the target again, and it is further refined in an iterative optimization process.
Structure-based drug design is often used in conjunction with combinatorial chemistry approaches. The process involves management of large databases of small molecules or combinatorial libraries. Combinatorial chemistry provides a high-throughput means of analyzing numerous compounds in the search for a good lead compound.
Benefits of sequencing the human genome
The successful completion of the sequencing of the human genome will allow the identification of protein targets for the treatment of diseases whose causes have not yet been unraveled. Proteins, in particular metalloproteins, are the usual targets for drugs, which are small molecules capable of interacting with a specific receptor and inhibiting its in vivo activity.
The new rational drug development is based on knowledge of the 3-D structure of the target proteins of interest. This knowledge provides the foundation for designing molecules capable of binding the receptor so as to maximize the drug's affinity and specificity towards the target. Using computational methods such as molecular docking and homology modeling, and experimental methods such as NMR spectroscopy, X-ray crystallography and biological assays, it is possible to screen libraries of compounds against protein targets and to solve the three-dimensional structure of a protein-inhibitor complex.
To be specific, such techniques allow the design and discovery of novel inhibitors with enhanced selectivity towards a particular receptor, limiting side effects and toxicity. It is possible to study protein targets both in the wild-type form and in the mutated form, the latter being highly important for explaining why individuals respond differently to a common drug treatment.
All marketed drugs today target only about 500 gene products. The elucidation of the human genome, which has an estimated 30,000 to 40,000 genes, presents immense new opportunities for drug discovery. It also simultaneously creates a potential bottleneck regarding the choice of targets to support the drug discovery pipeline. The major advances in genomics and sequencing mean that finding an attractive target is no longer a problem; finding the targets that are most likely to succeed has become the challenge. The focus of bioinformatics in the drug discovery process has therefore shifted from target identification to target validation.
Designing a New Drug using Bioinformatics Tools
Computational techniques assist in searching for drug targets and in designing drugs in silico, but the process still takes considerable time and money. To design a new drug, one needs to follow the path below.
.Identify Target Disease: One should know everything about the disease and existing or traditional remedies. Equally important is to look at very similar afflictions and their known treatments. Target identification alone is not sufficient in order to achieve a successful treatment of a disease. A real drug needs to be developed. This drug must influence the target protein in such a way that it does not interfere with normal metabolism. One way to achieve this is to block activity of the protein with a small molecule. Bioinformatics methods have been developed to virtually screen the target for compounds that bind and inhibit the protein. Another possibility is to find other proteins that regulate the activity of the target by binding and forming a complex.
.Study Interesting Compounds: One needs to identify and study the lead compounds that have some activity against a disease. These may be only marginally useful and may have severe side effects. These compounds provide a starting point for refinement of the chemical structures.
.Detect the Molecular Basis for Disease: If one knows that a drug must bind to a particular spot on a particular protein or nucleotide, then a drug can be tailored to bind at that site. This is often modeled computationally using any of several different techniques. Traditionally, the primary way of determining which compounds would be tested computationally was provided by the researchers' understanding of molecular interactions. A second method is the testing of large numbers of compounds from a database of available structures.
.Rational drug design techniques: These techniques build the researchers' understanding of how to choose likely compounds into a software package that is capable of modeling a very large number of compounds in an automated way. Many different algorithms have been used for this type of testing, many of which were adapted from artificial intelligence applications. The complexity of biological systems makes it very difficult to determine the structures of large biomolecules. Ideally, an experimentally determined (X-ray or NMR) structure is desired, but biomolecules are very difficult to crystallize.
.Refinement of compounds: When a large number of lead compounds have been found, computational and laboratory techniques can refine the molecular structures to give greater drug activity and fewer side effects. This is done both in the laboratory and computationally, by examining the molecular structures to determine which aspects are responsible for the drug activity and which for the side effects.
.Quantitative Structure Activity Relationships (QSAR): QSAR is used to identify which features of a compound, such as a functional group, should be changed in order to refine the drug. This computational technique consists of computing many numerical descriptors of a molecule and then performing a large curve fit to find out which aspects of the molecule correlate well with drug activity or side effect severity. This information can then be used to suggest new chemical modifications for synthesis and testing.
.Solubility of Molecule: It is necessary to check whether the candidate molecule is water soluble or readily soluble in fatty tissue, since this will affect what part of the body it becomes concentrated in. The ability to get a drug to the correct part of the body is an important factor in its potency. There is a continual exchange of information between the researchers doing QSAR studies, synthesis and testing. These techniques are frequently used and often very successful, since they do not rely on knowing the biological basis of the disease, which can be very difficult to determine.
.Drug Testing: Finally, when a drug has been shown to be effective by an initial assay technique, much more testing must be done before it can be given to human patients. Animal testing is the primary type of testing at this stage. Compounds deemed fit at this stage are then sent on to clinical trials, where additional side effects may be found and human dosages are determined.
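The curve fit at the heart of the QSAR step in the path above can be sketched in a few lines. The example below is a deliberately reduced illustration: it fits measured activity against a single descriptor by ordinary least squares, whereas real QSAR studies fit many descriptors at once. The function names and all data values are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up values: one descriptor per compound and its measured activity.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(descriptor, activity)
fit_quality = r_squared(descriptor, activity, slope, intercept)
print(slope, intercept, fit_quality)
```

A descriptor with a high R-squared against activity would be a natural candidate to vary in the next round of synthesis, while one that correlates with side-effect severity would be a candidate to reduce.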
Bioinformatics in Computer-Aided Drug Design
Computer-Aided Drug Design (CADD) is a specialized field that uses computational methods to simulate drug-receptor interactions. CADD methods are heavily dependent on bioinformatics tools, applications and databases. As such, there is considerable overlap between CADD research and bioinformatics.
Basically, bioinformatics can be regarded as a central hub that unites several disciplines and methodologies.
On the support side of the hub, information technology, software applications and databases all provide the infrastructure for bioinformatics. On the scientific side of the hub, bioinformatics methods are used extensively in molecular biology, genomics and proteomics. They are also used in other emerging areas, including metabolomics and transcriptomics, and in CADD research.
How does bioinformatics support CADD research?
There are several key areas where bioinformatics supports CADD research.
Virtual High-Throughput Screening (vHTS): In vHTS, protein targets are screened against databases of small-molecule compounds to see which molecules bind strongly to the target. If there is a "hit" with a particular compound, it can be extracted from the database for further testing. With today's computational resources, several million compounds can be screened in a few days on sufficiently large clustered computers. Pursuing a handful of promising leads for further development can save researchers considerable time and expense. ZINC is a good example of a vHTS compound library.
Sequence Analysis: In CADD research, one often knows the genetic sequence of multiple organisms or the amino acid sequence of proteins from several species. It is very useful to determine how similar or dissimilar the organisms are based on gene or protein sequences. With this information one can infer the evolutionary relationships of the organisms, search for similar sequences in bioinformatics databases and find species related to those under investigation. There are many bioinformatics sequence analysis tools that can be used to determine the level of sequence similarity.
Tools such as Transeq can help determine the protein coding regions of a DNA sequence. ClustalW is used to align DNA or protein sequences in order to elucidate their relatedness as well as their evolutionary origin.
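As a small illustration of what "level of sequence similarity" can mean in practice, the sketch below computes percent identity for two sequences that have already been aligned, for example by a tool such as ClustalW. The function name and the example sequences are invented, and the choice to ignore gapped columns is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if residue_a == residue_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned protein fragments (one gap, one substitution).
seq_a = "MKT-AYIAK"
seq_b = "MKTQAYLAK"
identity = percent_identity(seq_a, seq_b)
print(identity)  # 87.5
```

Other definitions divide by the full alignment length or by the length of the shorter sequence, so reported identities should always state which denominator was used.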
Homology Modeling: One great challenge in CADD research is determining the 3-D structure of proteins. Most drug targets are proteins, so it's important to know their 3-D structure in detail. It's estimated that the human body has 500,000 to 1 million proteins. However, the 3-D structure is known for only a small fraction of these. Homology modeling is one method used to predict 3-D structure. In homology modeling, the amino acid sequence of a specific protein (target) is known, and the 3-D structures of proteins related to the target (templates) are known. Bioinformatics software tools are then used to predict the 3-D structure of the target based on the known 3-D structures of the templates. MODELLER is a well-known tool in homology modeling, and the SWISS-MODEL Repository is a database of protein structures created with homology modeling.
Similarity Searches: All the biopharmaceutical companies are engaged in searching for drug analogues. Starting with a promising drug molecule, one can search for chemical compounds with similar structure or properties to a known compound. There are a variety of methods used in these searches, including sequence similarity, 2D and 3D shape similarity, substructure similarity, electrostatic similarity and others. A variety of bioinformatics tools and search engines are available for this work.
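One widely used measure for substructure-style similarity is the Tanimoto coefficient computed on molecular fingerprints. The source does not name a specific measure, so this is an illustrative choice. The sketch below represents each fingerprint as the set of "on" bit positions; the compound names and bit sets are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# Hypothetical fingerprints: each set lists the indices of the bits that are on.
query = {1, 4, 7, 9, 12}
library = {
    "cmpd_A": {1, 4, 7, 9, 13},
    "cmpd_B": {2, 3, 5},
    "cmpd_C": {1, 4, 7, 9, 12},
}

# Rank library compounds from most to least similar to the query.
ranked = sorted(library, key=lambda name: tanimoto(query, library[name]), reverse=True)
print(ranked)  # ['cmpd_C', 'cmpd_A', 'cmpd_B']
```

A score of 1.0 means the fingerprints are identical and 0.0 means they share no bits; analogue searches typically keep compounds above some chosen threshold.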
Drug Lead Optimization: When a promising lead candidate has been found in a drug discovery program, the need arises to optimize the structure and properties of the potential drug. This usually involves a series of modifications to the primary structure (scaffold) and secondary structure (moieties) of the compound. This process can be enhanced using software tools that explore compounds related to the lead candidate (bioisosteres). Lead optimization tools such as WABE offer a rational approach to drug design that can reduce the time and expense of searching for related compounds.
Physicochemical Modeling: Drug-receptor interactions occur on atomic scales. To better understand how and why drug compounds bind to protein targets, one must consider the biochemical and biophysical properties of both the drug and its target at an atomic level. Swiss-PDB can predict key physicochemical properties, such as hydrophobicity and polarity, that have a profound influence on how drugs bind to proteins.
Drug Bioavailability and Bioactivity: Most drug candidates fail in Phase III clinical trials, after many years of research and millions of dollars have been spent on them. Most fail because of toxicity or problems with metabolism. The key characteristics for drugs are Absorption, Distribution, Metabolism, Excretion, Toxicity (ADMET) and efficacy; in other words, bioavailability and bioactivity. Although these properties are usually measured in the lab, they can also be predicted in advance with bioinformatics software.
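A simple and well-known example of predicting bioavailability in advance is Lipinski's rule of five, which flags compounds unlikely to be orally absorbed. The source does not mention this rule; it is used here only as an illustration of the kind of filter such software applies. The class, function names and candidate values below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    mol_weight: float   # molecular weight in daltons
    log_p: float        # octanol-water partition coefficient (logP)
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def rule_of_five_violations(c):
    """Count how many of Lipinski's four thresholds the candidate exceeds."""
    checks = [
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(c, max_violations=1):
    """Common reading of the rule: at most one violation is tolerated."""
    return rule_of_five_violations(c) <= max_violations


# Hypothetical candidates with invented property values.
small = Candidate("cand_1", mol_weight=320.4, log_p=2.1, h_donors=2, h_acceptors=5)
bulky = Candidate("cand_2", mol_weight=780.9, log_p=6.3, h_donors=7, h_acceptors=12)

print(passes_rule_of_five(small), passes_rule_of_five(bulky))
```

A filter like this covers only oral absorption; toxicity and metabolism need separate predictive models.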
Advantages of CADD
CADD methods and bioinformatics tools offer significant benefits for drug discovery programs.
Cost Savings: Many biopharmaceutical companies now use computational methods and bioinformatics tools to reduce the cost burden of drug discovery. Virtual screening, lead optimization and predictions of bioavailability and bioactivity can help guide experimental research. Only the most promising experimental lines of inquiry need be followed, and experimental dead-ends can be avoided early based on the results of CADD simulations.
Time-to-Market: The predictive power of CADD can help drug research programs choose only the most promising drug candidates. By focusing drug research on specific lead candidates and avoiding potential "dead-end" compounds, biopharmaceutical companies can get drugs to market more quickly.
Bioinformatics Works behind the Scenes
New malaria enzyme laid bare with help of computer calculations.
Recently, relying solely on computer calculations, a research team at Uppsala University in Sweden (Center for Structural Biology, Medical Chemistry, and Computer Chemistry, directed by Professor Alwyn Jones) managed to reveal both the structure and the function of a newly discovered enzyme from the most dangerous malaria parasite, Plasmodium falciparum. All that was needed was the amino acid sequence of the enzyme. The findings may represent a breakthrough for future pharmaceutical research, whose aim is to develop drugs for some of the most severe and widespread diseases in the world, such as malaria and TB. The results, which recently appeared in the journal Biochemistry, are the work of Professor Johan Åqvist and doctoral student Sinisa Bjelic.
"The enzyme we studied is a new type, with previously unknown catalyst groups. This made it especially interesting as a target molecule for new drugs. Using only computer calculations, we succeeded in revealing both what it looks like and how it functions. It's the first time anybody ever did that," says Johan Åqvist. The researchers started by comparing the enzyme's amino acid sequence with other known sequences. Then they ran computer simulations of how it might move in order to find possible structures, after which they looked at plausible ways in which a substrate, a small peptide, might attach to the enzyme. In this way it was possible to predict the structure of the enzyme, how the substrate binds, and the mechanism and rate of the chemical reaction.
(The author is with Accure Labs Pvt. Ltd., New Delhi)